Value Iteration Working With Belief Subsets

Authors

  • Weihong Zhang
  • Nevin L. Zhang
Abstract

Value iteration is a popular algorithm for solving POMDPs. However, it is inefficient in practice. The primary reason is that it needs to conduct value updates for all the belief states in the (continuous) belief space. In this paper, we study value iteration working with a subset of the belief space, i.e., it conducts value updates only for belief states in the subset. We present a way to select a belief subset and describe an algorithm to conduct value iteration over the selected subset. The algorithm is attractive in that it works with a belief subset yet retains the quality of the generated values. Given a POMDP, we show how to determine a priori whether the selected subset is a proper subset of the belief space. If this is the case, the algorithm gains in both space (a more compact representation) and time (greater efficiency).
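
As a rough illustration of what "value updates only for belief states in the subset" can look like, the sketch below runs a generic textbook point-based Bellman backup over a finite list of beliefs. It is not the algorithm of this paper: the abstract does not say how the subset is selected or represented, so the finite belief list, the two-state POMDP, and every name and number here (T, O, R, GAMMA, backup, restricted_value_iteration) are assumptions made for illustration.

```python
# Hypothetical two-state, two-action, two-observation POMDP.
# All numbers are illustrative assumptions, not taken from the paper.
GAMMA = 0.9
STATES = (0, 1)
ACTIONS = (0, 1)
OBSERVATIONS = (0, 1)

# T[a][s][s2] = P(s2 | s, a)
T = (
    ((0.9, 0.1), (0.1, 0.9)),
    ((0.5, 0.5), (0.5, 0.5)),
)
# O[a][s2][z] = P(z | s2, a)
O = (
    ((0.8, 0.2), (0.2, 0.8)),
    ((0.5, 0.5), (0.5, 0.5)),
)
# R[a][s] = immediate reward for taking action a in state s
R = (
    (1.0, 0.0),
    (0.0, 0.5),
)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def backup(b, alphas):
    """One Bellman backup at belief b; returns the best new alpha vector."""
    best_vec, best_val = None, float("-inf")
    for a in ACTIONS:
        vec = list(R[a])
        for z in OBSERVATIONS:
            # Project every current alpha vector back through (a, z).
            projected = [
                [sum(T[a][s][s2] * O[a][s2][z] * alpha[s2] for s2 in STATES)
                 for s in STATES]
                for alpha in alphas
            ]
            # Keep the projection that is best at this particular belief.
            g = max(projected, key=lambda p: dot(b, p))
            vec = [vec[s] + GAMMA * g[s] for s in STATES]
        val = dot(b, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def restricted_value_iteration(beliefs, n_iters):
    """Update values only at the given beliefs, one alpha vector per belief."""
    alphas = [[0.0] * len(STATES)]
    for _ in range(n_iters):
        alphas = [backup(b, alphas) for b in beliefs]
    return alphas


def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(b, alpha) for alpha in alphas)
```

Calling restricted_value_iteration with a short list of beliefs, for example [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]], returns one alpha vector per belief. The cost of each sweep then scales with the size of that list instead of with the full continuous belief space.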

Similar resources

Restricted Value Iteration: Theory and Algorithms

Value iteration is a popular algorithm for finding near optimal policies for POMDPs. It is inefficient due to the need to account for the entire belief space, which necessitates the solution of large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-...

Impact of reconstruction method on quantitative parameters of 99mTc-TRODAT-1 SPECT

Introduction: Quantitative evaluation is recommended to improve diagnostic ability and serial assessment of dopamine transporter (DAT) density scans. We decided to compare the ordered subsets expectation-maximization (OSEM) with filtered back-projection (FBP), and to investigate the impact of different iteration and cut-off frequencies on SBR values. Methods...

Value Iteration over Belief Subspace

Partially Observable Markov Decision Processes (POMDPs) provide an elegant framework for AI planning tasks with uncertainties. Value iteration is a well-known algorithm for solving POMDPs. It is notoriously difficult because at each step it needs to account for every belief state in a continuous space. In this paper, we show that value iteration can be conducted over a subset of belief space. T...

Incremental Least Squares Policy Iteration for POMDPs

We present a new algorithm, incremental least squares policy iteration (ILSPI), for finding the infinite-horizon policy for partially observable Markov decision processes (POMDPs). The ILSPI algorithm computes a basis representation of the value function by minimizing the Bellman residual and it performs policy improvement in reachable belief states. A number of optimal basis functions are dete...

Point-based value iteration: An anytime algorithm for POMDPs

This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points, and planning for those only. By using stochastic trajectories to choose belief points, and by maintaining only one value hyperplane per point, it is able to successfully solve large problems, incl...

Publication date: 2002